On Fuzzy vs. Metric Similarity Search in Complex Databases
Authors
Abstract
Similarity search is widely used in many areas of computing, including multimedia databases, data mining, bioinformatics, and social networks. For a long time, database-oriented applications of similarity search restricted the definition of similarity to metric distances. Thanks to the metric postulates (reflexivity, non-negativity, symmetry and triangle inequality), a metric similarity makes it possible to build a metric index over the database, which can subsequently be used for efficient (fast) similarity search. On the other hand, the metric postulates limit the domain experts (the providers of the similarity measure) in how they can model similarity. In this paper we propose an alternative, non-metric method of indexing for efficient similarity search. The requirement of a metric is replaced by the requirement of a fuzzy similarity that satisfies the transitivity property with respect to a tuneable fuzzy conjunctor. We also show a duality between the fuzzy approach and the metric one.
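The duality mentioned at the end of the abstract can be illustrated with a well-known special case. This is a minimal sketch and not the paper's own construction: it assumes a distance bounded in [0, 1], the similarity s = 1 - d, and the Łukasiewicz conjunctor T(a, b) = max(0, a + b - 1). Under those assumptions, the triangle inequality for d holds exactly when s is T-transitive, that is, when T(s(x, y), s(y, z)) <= s(x, z). All function names and the sample points are hypothetical.

```python
from itertools import product


def lukasiewicz(a: float, b: float) -> float:
    """Lukasiewicz conjunctor (t-norm), chosen here only for illustration."""
    return max(0.0, a + b - 1.0)


def satisfies_triangle(dist, points, eps=1e-9) -> bool:
    """Check d(x, z) <= d(x, y) + d(y, z) on every triple of sample points."""
    return all(
        dist(x, z) <= dist(x, y) + dist(y, z) + eps
        for x, y, z in product(points, repeat=3)
    )


def is_t_transitive(sim, points, conjunctor, eps=1e-9) -> bool:
    """Check T(s(x, y), s(y, z)) <= s(x, z) on every triple of sample points."""
    return all(
        conjunctor(sim(x, y), sim(y, z)) <= sim(x, z) + eps
        for x, y, z in product(points, repeat=3)
    )


def bounded_dist(x: float, y: float) -> float:
    """A metric on the reals, truncated so that values stay in [0, 1]."""
    return min(1.0, abs(x - y))


def squared_dist(x: float, y: float) -> float:
    """Squared difference, truncated to [0, 1]; violates the triangle inequality."""
    return min(1.0, (x - y) ** 2)


def to_similarity(dist):
    """Turn a [0, 1]-bounded distance into a similarity via s = 1 - d."""
    return lambda x, y: 1.0 - dist(x, y)


points = [0.0, 0.25, 0.5, 1.0, 1.7]

# The metric and its dual similarity pass their respective checks together.
metric_ok = satisfies_triangle(bounded_dist, points)
fuzzy_ok = is_t_transitive(to_similarity(bounded_dist), points, lukasiewicz)

# The non-metric distance and its dual similarity fail together.
nonmetric_ok = satisfies_triangle(squared_dist, points)
nonmetric_fuzzy_ok = is_t_transitive(to_similarity(squared_dist), points, lukasiewicz)
```

The check is exhaustive only over the sample points, so a passing result is evidence rather than proof. Swapping `lukasiewicz` for a different conjunctor changes which similarities count as transitive, which is one way to read the abstract's "tuneable" conjunctor.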
Similar resources
GEST: a gene expression search tool based on a novel Bayesian similarity metric
Gene expression array technology has made possible the assay of expression levels of tens of thousands of genes at a time; large databases of such measurements are currently under construction. One important use of such databases is the ability to search for experiments that have similar gene expression levels as a query, potentially identifying previously unsuspected relationships among cellul...
Determining Feature Weight of Pattern Classification by Using Rough Genetic Algorithm and Fuzzy Similarity Measure
The nearest neighbor (NN) methods solve classification problems by storing examples as points in a feature space, which requires some means of measuring distances between examples. However, it suffers from the existence of noisy attributes. One resolution is to modify the distance of similarity degree using attribute weights, which can not only decrease the influence of noisy attributes, but als...
Dynamic Similarity Metric Using Fuzzy Predicates for Case-Based Planning
Case-based planning (CBP) is a knowledge-based planning technique which develops new plans by reusing its past experience instead of planning from scratch. The task of CBP becomes difficult when the knowledge needed for planning can not be expressed precisely. In this paper, we tackle this issue by modeling imprecise information using fuzzy predicates; and accordingly, we present a dynamic simi...
A multi-step strategy for approximate similarity search in image databases
Many strategies for similarity search in image databases assume a metric and quadratic form-based similarity model where an optimal lower bounding distance function exists for filtering. These strategies are mainly two-step, with the initial "filter" step based on a spatial or metric access method followed by a "refine" step employing expensive computation. Recent research on robust matching me...
ROBUSTNESS OF THE TRIPLE IMPLICATION INFERENCE METHOD BASED ON THE WEIGHTED LOGIC METRIC
This paper focuses on the robustness problem of full implication triple implication inference method for fuzzy reasoning. First of all, based on strong regular implication, the weighted logic metric for measuring distance between two fuzzy sets is proposed. Besides, under this metric, some robustness results of the triple implication method are obtained, which demonstrates that the triple impli...